Generalized likelihood ratio test for voiced/unvoiced decision using the harmonic plus noise model

Authors

  • Etan Fisher
  • Joseph Tabrikian
  • Shlomo Dubnov
Abstract

This paper presents a novel method for the voiced/unvoiced decision in speech and music signals. A voiced/unvoiced decision is required for many applications, including better modeling for analysis/synthesis, detection of model changes for segmentation, and better signal characterization for indexing and recognition. The proposed method is based on the Generalized Likelihood Ratio Test (GLRT) and assumes colored Gaussian noise with unknown covariance. Under the voiced hypothesis, a harmonic plus noise model is assumed. The derived method is combined with a Maximum A Posteriori Probability (MAP) scheme to obtain a voiced/unvoiced tracking algorithm. The performance of the proposed method is tested on the Keele University database at different signal-to-noise ratios (SNRs), and the results show that the algorithm performs well even under severe noise conditions.
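The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes white Gaussian noise with unknown variance instead of the colored noise with unknown covariance described above, searches a fixed grid of candidate pitches, and replaces the MAP scheme with a simple two-state Viterbi smoother. The function names (`harmonic_basis`, `glrt_statistic`, `map_track`) and all parameter values are hypothetical.

```python
import numpy as np

def harmonic_basis(n_samples, f0, fs, n_harmonics):
    """Cosine/sine columns at the first n_harmonics multiples of f0."""
    t = np.arange(n_samples) / fs
    k = np.arange(1, n_harmonics + 1)
    phase = 2.0 * np.pi * f0 * np.outer(t, k)
    return np.hstack([np.cos(phase), np.sin(phase)])

def glrt_statistic(frame, fs, f0_grid, n_harmonics=5):
    """Simplified GLRT statistic (ASSUMPTION: white noise, unknown variance).

    H0 (unvoiced): frame is noise only.
    H1 (voiced):   frame is a harmonic signal plus noise.
    Returns N * log(RSS_H0 / RSS_H1), with RSS_H1 minimised over f0_grid.
    """
    x = np.asarray(frame, dtype=float)
    n = x.size
    energy_h0 = float(x @ x)
    best_residual = energy_h0
    for f0 in f0_grid:
        basis = harmonic_basis(n, f0, fs, n_harmonics)
        # Least-squares fit of the harmonic amplitudes for this candidate pitch.
        coeffs, *_ = np.linalg.lstsq(basis, x, rcond=None)
        residual = x - basis @ coeffs
        best_residual = min(best_residual, float(residual @ residual))
    eps = np.finfo(float).tiny  # guards against log(0) and 0/0 on silent frames
    return n * np.log((energy_h0 + eps) / (best_residual + eps))

def map_track(stats, threshold, p_stay=0.9):
    """Two-state Viterbi smoothing of per-frame statistics (hypothetical
    stand-in for a MAP tracker). Expects at least one frame.
    Returns a boolean array, True = voiced."""
    stats = np.asarray(stats, dtype=float)
    # Per-frame scores: column 0 = unvoiced, column 1 = voiced.
    emit = np.stack([np.zeros_like(stats), 0.5 * stats - threshold], axis=1)
    log_trans = np.log(np.array([[p_stay, 1.0 - p_stay],
                                 [1.0 - p_stay, p_stay]]))
    score = emit[0].copy()
    back = np.zeros((stats.size, 2), dtype=int)
    for t in range(1, stats.size):
        cand = score[:, None] + log_trans      # cand[i, j]: state i -> state j
        back[t] = np.argmax(cand, axis=0)
        score = cand[back[t], [0, 1]] + emit[t]
    path = np.zeros(stats.size, dtype=int)
    path[-1] = int(np.argmax(score))
    for t in range(stats.size - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path.astype(bool)

# Illustrative usage on synthetic frames (values are arbitrary).
fs = 8000.0
rng = np.random.default_rng(0)
t = np.arange(256) / fs
voiced = sum(np.cos(2 * np.pi * 120.0 * k * t) / k for k in range(1, 6))
voiced = voiced + 0.1 * rng.standard_normal(t.size)
unvoiced = rng.standard_normal(t.size)
f0_grid = np.arange(60.0, 400.0, 2.0)
frames = [unvoiced, unvoiced, voiced, voiced, unvoiced]
stats = [glrt_statistic(f, fs, f0_grid) for f in frames]
print(map_track(stats, threshold=100.0))
```

The threshold and the state-persistence probability `p_stay` trade off missed voiced frames against false alarms; in this sketch they are arbitrary and would need tuning on labeled data.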

Similar resources

Uniform concatenative excitation model for synthesising speech without voiced/unvoiced classification

In general, speech synthesis using the source-filter model of speech production requires classifying speech into two classes (voiced and unvoiced), a step that is prone to errors. For voiced speech, the input of the synthesis filter is an approximately periodic excitation, whereas for unvoiced speech it is a noise signal. This paper proposes an excitation model which can be used to synthesise both ...

Approximate Kalman Filtering for the Harmonic plus Noise Model

We present a probabilistic description of the Harmonic plus Noise Model (HNM) for speech signals. This probabilistic formulation permits Maximum Likelihood (ML) parameter estimation and speech synthesis becomes a straightforward sampling from a distribution. It also permits development of a Kalman filter that tracks model parameters such as pitch, harmonic amplitudes, and autoregressive coeffic...

A Statistical Model-Based V/UV Decision under Background Noise Environments

In this letter, we propose an approach that incorporates a statistical model for the voiced/unvoiced (V/UV) speech decision in background noise environments. Our approach consists of splitting the input noisy speech into two separate bands and applying a statistical model to each band. We compute and compare the likelihood ratio (LR) for each band based on the statistical model and estimated n...

Improvements in Speaker Characterization Using Spectral Subband Energy Based on Harmonic plus Noise Model

We previously proposed the use of the Spectral Subband Energy Ratio (SSER) as speaker features in a speaker verification system [1]. Those SSER features were derived from two distinct components, the harmonic and noise parts of speech, which were decomposed from the original speech by the Harmonic plus Noise Model (HNM). In this paper, we report several recent improvements to this approach. First, we go ...

Robust automatic continuous-speech recognition based on a voiced-unvoiced decision

This paper addresses the implementation of a robust front end for a large-vocabulary Continuous Speech Recognition (CSR) system based on a Voiced-Unvoiced (V-U) decision. Our approach is based on the separation of the speech signal into voiced and unvoiced components. Consequently, speech enhancement can be achieved through processing of the voiced and the unvoiced compo...

Publication date: 2003